4 Binary file formats (BFF)

4.1 Abstraction

The general structure of a BFF object can be described by the following abstraction:

Most BFF objects can be mapped to the general model in Fig. 7. Information about sections, the symbol table and so on is usually identified within the file header. An example of a BFF for which the four object structures cannot easily be determined is the DOS EXE executable file format. In a DOS EXE file, the file header contains information about the relocation table, but no information about where the symbol table is located. A DOS EXE has only one section, and it is not possible to determine where the code, the data and the symbol information are located within this section without disassembling the code. The rest of this chapter gives examples of some common BFFs.

4.2 BFF examples

4.2.1 Intel MZ DOS EXE format

The old EXE binaries are executed under the MS-DOS environment. Unlike the earlier DOS COM file, which is limited to 64 KB in size, EXE files can use multiple segments. The structure of the DOS EXE is displayed in Fig. 8.


An EXE file consists of a file header, a relocation table and the binary code. Below is a hex dump of the first 64 bytes of the file header from a DOS EXE hello world program. The total size of this program is 6432 bytes. The first column is the address offset relative to the beginning of the file. Columns 2 to 17 are the bytes of data (in hexadecimal). The last column is the ASCII representation of the bytes in columns 2 to 17.


      00000000: 4D 5A 2E 01 0D 00 01 00 - 20 00 0D 00 FF FF 77 01 MZ...... .....w.
      00000010: 80 00 00 00 00 00 00 00 - 3E 00 00 00 01 00 FB 50 ........>......P
      00000020: 6A 72 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00 jr..............
      00000030: 00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 01 00 ................

The first two bytes of the header, "MZ" (4D 5A), are the DOS EXE signature. Words are stored low byte first, so the next word, "2E 01", reads as 012Eh; it gives the number of bytes used in the last 512-byte page. The following two words are the size of the file in 512-byte pages, "0D 00", and the number of relocation entries, "01 00". The relocation table starts after the file header; its offset (3Eh here) is stored in the word at 18h. The complete format, including the structure of the header and the relocation table for the DOS EXE format, can be found in appendix 1a.
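The field walk-through above can be reproduced with a few lines of Python. This is only a minimal sketch: the bytes are copied from the hex dump, while the function name `parse_mz_header` and the dictionary keys are illustrative labels that do not appear in the original format description. The header-size field at 08h is taken from the commonly documented MZ layout and is not discussed in the text above.

```python
import struct

# First 64 bytes of the hello-world DOS EXE, copied from the hex dump.
HEADER_HEX = """
4D 5A 2E 01 0D 00 01 00 20 00 0D 00 FF FF 77 01
80 00 00 00 00 00 00 00 3E 00 00 00 01 00 FB 50
6A 72 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00
"""
header = bytes.fromhex(HEADER_HEX)


def parse_mz_header(data):
    """Decode the leading fields of a DOS EXE header.

    All fields are 16-bit little-endian words. The key names are
    illustrative labels, not official field names.
    """
    signature, last_page_bytes, pages, reloc_count, header_paragraphs = \
        struct.unpack_from("<2s4H", data, 0)
    if signature != b"MZ":
        raise ValueError("not a DOS EXE file")
    # The word at 18h holds the file offset of the relocation table.
    (reloc_table_offset,) = struct.unpack_from("<H", data, 0x18)
    return {
        "last_page_bytes": last_page_bytes,
        "pages": pages,
        "reloc_count": reloc_count,
        "header_bytes": header_paragraphs * 16,  # paragraphs are 16 bytes
        "reloc_table_offset": reloc_table_offset,
    }


fields = parse_mz_header(header)
for name, value in fields.items():
    print(f"{name:20s} {value:5d}  ({value:04X}h)")
```

For the dump above, the sketch reports 302 bytes (012Eh) in the last page, 13 pages, one relocation entry and a relocation table at 3Eh.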

4.2.2 Windows 3.x NE (New executable) EXE format

The Windows new executable (NE) format, also known as the segmented executable format, is used by the Microsoft Windows operating system. This BFF is defined for Windows applications and dynamic-link libraries (DLLs). The NE format is an extension of the old MS-DOS EXE BFF; it contains extra information for Windows code, data and resources. An NE file has two headers: the old-style MS-DOS header and the new segmented header. Fig. 9 shows the structure of the NE format.


4.2.2.1 Old-style file header

The old MS-DOS file header contains information for an MS-DOS executable file. The top four parts of the NE BFF describe a stub MS-DOS program. If the file is run in real-mode MS-DOS (without Windows), the stub program is executed, usually displaying the message "This program must be run under Microsoft Windows". The relative byte offset to the stub program's relocation table is located at offset 18h. If this value is 40h, then the offset to the new EXE header is stored at 3Ch.

4.2.2.2 New EXE header

The Windows new segmented header contains information for the loader in Microsoft Windows. This header is recognised as the new segmented header if it contains the signature word "NE". The new EXE header contains information such as the linker version number, entry table offset, flags and so on. The Windows loader copies this header into the system's module table. The module table is where this information is managed to provide support for dynamic linking.

4.2.2.3 Hex dump of a Windows "Hello World" NE

Below are the first 320 bytes of a dynamically linked "Hello World" program in Windows NE format. The total size of this program is 16384 bytes.


      00000000: 4D 5A 50 00 02 00 00 00 - 04 00 0F 00 FF FF 00 00 MZP.............
      00000010: B8 00 00 00 00 00 00 00 - 40 00 00 00 00 00 00 00 ........@.......
      00000020: 00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00 ................
      00000030: 00 00 00 00 00 00 00 00 - 00 00 00 00 90 00 00 00 ................
      00000040: BA 10 00 0E 1F B4 09 CD - 21 B8 01 4C CD 21 90 90 .......!..L.!...
      00000050: 54 68 69 73 20 70 72 6F - 67 72 61 6D 20 6D 75 73 This program mus
      00000060: 74 20 62 65 20 72 75 6E - 20 75 6E 64 65 72 20 4D t be run under M
      00000070: 69 63 72 6F 73 6F 66 74 - 20 57 69 6E 64 6F 77 73 icrosoft Windows
      00000080: 2E 0D 0A 24 00 00 00 00 - 00 00 00 00 00 00 00 00 ...$............
      00000090: 4E 45 05 0A 94 00 0A 00 - 00 00 00 00 0A 00 02 00 NE..............
      000000A0: 00 10 00 14 00 00 01 00 - 00 00 02 00 02 00 04 00 ................
      000000B0: 0D 00 40 00 50 00 50 00 - 72 00 7A 00 2E 01 00 00 ..@.P.P.r.z.....
      000000C0: 01 00 09 00 00 00 02 00 - 00 00 00 00 00 00 00 03 ................
      000000D0: 01 00 F5 24 50 1D F5 24 - 15 00 D8 05 51 0D D8 05 ...$P..$....Q...
      000000E0: 05 48 45 4C 4C 4F 00 00 - 16 40 5F 45 41 53 59 57 .HELLO...@_EASYW
      000000F0: 49 4E 50 52 4F 43 24 51 - 55 49 55 49 55 49 4C 01 INPROC$QUIUIUIL.
      00000100: 00 00 01 00 08 00 0D 00 - 11 00 00 06 4B 45 52 4E ............KERN
      00000110: 45 4C 04 55 53 45 52 03 - 47 44 49 08 4B 45 59 42 EL.USER.GDI.KEYB
      00000120: 4F 41 52 44 01 FF 01 CD - 3F 01 0E 1D 00 00 09 48 OARD....?......H
      00000130: 45 4C 4C 4F 2E 45 58 45 - 00 00 00 00 00 00 00 00 ELLO.EXE........

The region starting at 00h is the old MS-DOS header. The word at 18h is 40h, the address of the relocation table and of the MS-DOS stub program. The stub program starts at 40h, and its message can be seen in the ASCII column on the right. The value at 3Ch is 90h, the location of the new segmented header, which carries the valid signature "NE" (4E 45). The segment table starts at D0h, the resident name table at E0h, and so on. For a complete description of the Windows NE BFF, see appendix 1b.
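The steps used to locate the new header and its tables can be sketched in Python as follows. The bytes are the first 240 bytes of the dump; the function names are hypothetical. The positions of the table-offset words inside the new header (22h for the segment table, 26h for the resident name table, both relative to the start of the new header) are assumptions taken from the commonly documented NE layout rather than from the text above. The check accepts any value of 40h or above at 18h, which is a slightly looser test than the one described in section 4.2.2.1.

```python
import struct

# First 240 bytes (00h-EFh) of the "Hello World" NE file, from the hex dump.
DUMP_HEX = """
4D 5A 50 00 02 00 00 00 04 00 0F 00 FF FF 00 00
B8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 90 00 00 00
BA 10 00 0E 1F B4 09 CD 21 B8 01 4C CD 21 90 90
54 68 69 73 20 70 72 6F 67 72 61 6D 20 6D 75 73
74 20 62 65 20 72 75 6E 20 75 6E 64 65 72 20 4D
69 63 72 6F 73 6F 66 74 20 57 69 6E 64 6F 77 73
2E 0D 0A 24 00 00 00 00 00 00 00 00 00 00 00 00
4E 45 05 0A 94 00 0A 00 00 00 00 00 0A 00 02 00
00 10 00 14 00 00 01 00 00 00 02 00 02 00 04 00
0D 00 40 00 50 00 50 00 72 00 7A 00 2E 01 00 00
01 00 09 00 00 00 02 00 00 00 00 00 00 00 00 03
01 00 F5 24 50 1D F5 24 15 00 D8 05 51 0D D8 05
05 48 45 4C 4C 4F 00 00 16 40 5F 45 41 53 59 57
"""
image = bytes.fromhex(DUMP_HEX)


def find_ne_header(data):
    """Return the file offset of the new EXE header, or None if absent."""
    if data[:2] != b"MZ":
        return None
    # Word at 18h: offset of the stub program's relocation table.
    (reloc_offset,) = struct.unpack_from("<H", data, 0x18)
    if reloc_offset < 0x40:
        return None  # plain MS-DOS executable, no new header expected
    # 3Ch holds the offset of the new header (read here as a 32-bit value).
    (ne_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[ne_offset:ne_offset + 2] != b"NE":
        return None
    return ne_offset


def ne_table_offsets(data, ne_offset):
    """Return file offsets of the segment and resident name tables."""
    # Assumed layout: three consecutive words at 22h, 24h and 26h.
    segment, _resource, resident = struct.unpack_from("<3H", data, ne_offset + 0x22)
    return {
        "segment_table": ne_offset + segment,
        "resident_name_table": ne_offset + resident,
    }


def read_counted_string(data, offset):
    """Read a string stored as one length byte followed by its characters."""
    length = data[offset]
    return data[offset + 1:offset + 1 + length].decode("ascii")


ne_offset = find_ne_header(image)
tables = ne_table_offsets(image, ne_offset)
module_name = read_counted_string(image, tables["resident_name_table"])
print(f"NE header at {ne_offset:02X}h")
print(f"segment table at {tables['segment_table']:02X}h")
print(f"resident name table at {tables['resident_name_table']:02X}h")
print("first resident name:", module_name)
```

Run against the dump, the sketch finds the new header at 90h, the segment table at D0h and the resident name table at E0h, matching the values read off by hand. The first entry of the resident name table is the string "HELLO".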

4.2.3 Executable and linking format (ELF)

The executable and linking format (ELF) [23] [24] is organised in a very general structure. This BFF is used on Sun SPARC as well as Intel x86 machines running the Solaris operating system. ELF provides two parallel views of a file's contents: one for program linking and the other for program execution. Fig. 10 shows the structure of an ELF BFF:


Segments are made up of one or more sections. Sections hold the object file information used during linking: code, data, relocation table, dynamic linking information and so on. The section header table contains information about each section of the object file. Each entry in the section header table holds information such as the section name, section size and section type. The program header table describes information used by the system when creating the process image. The ELF format is very extensive. Its structures offer a more comprehensive list of options than most other BFFs, and it comprises components found in many BFFs, including dynamic symbol tables, relocation tables and hash tables. Because of this, and because of its portability (it runs on SPARC and x86), it is used as the general frame model for BFFs in the implementation of the SRL. Below is the hex dump of the file header and the string table for the same "Hello World" program as a (SPARC, Solaris, ELF) binary object:

      00000000: 7F 45 4C 46 01 02 01 00 - 00 00 00 00 00 00 00 00 ?ELF............
      00000010: 00 02 00 02 00 00 00 01 - 00 01 06 58 00 00 00 34 ...........X...4
      00000020: 00 00 10 BC 00 00 00 00 - 00 34 00 20 00 05 00 28 .........4. ...(
      00000030: 00 19 00 17 00 00 00 06 - 00 00 00 34 00 01 00 34 ...........4...4
      00000040: 00 00 00 00 00 00 00 A0 - 00 00 00 A0 00 00 00 05 ................


      00000C40: 00 00 01 49 00 01 07 38 - 00 00 00 38 12 00 00 0A ...I...8...8....
      00000C50: 00 61 2E 6F 75 74 00 63 - 72 74 69 2E 73 00 5F 65 .a.out.crti.s._e
      00000C60: 78 5F 74 65 78 74 30 00 - 5F 65 78 5F 72 61 6E 67 x_text0._ex_rang
      00000C70: 65 30 00 5F 65 78 5F 73 - 68 61 72 65 64 30 00 63 e0._ex_shared0.c
      00000C80: 72 74 31 2E 73 00 76 61 - 6C 75 65 73 2D 58 74 2E rt1.s.values-Xt.
      00000C90: 63 00 74 65 73 74 2E 63 - 00 63 72 74 6E 2E 73 00 c.test.c.crtn.s.
      00000CA0: 5F 65 78 5F 74 65 78 74 - 31 00 5F 65 78 5F 72 61 _ex_text1._ex_ra
      00000CB0: 6E 67 65 31 00 5F 65 78 - 5F 73 68 61 72 65 64 31 nge1._ex_shared1
      00000CC0: 00 5F 73 74 61 72 74 00 - 5F 65 6E 76 69 72 6F 6E ._start._environ
      00000CD0: 00 5F 65 6E 64 00 5F 65 - 78 5F 72 65 67 69 73 74 ._end._ex_regist
      00000CE0: 65 72 00 5F 47 4C 4F 42 - 41 4C 5F 4F 46 46 53 45 er._GLOBAL_OFFSE
      00000CF0: 54 5F 54 41 42 4C 45 5F - 00 61 74 65 78 69 74 00 T_TABLE_.atexit.
      00000D00: 65 78 69 74 00 5F 69 6E - 69 74 00 5F 44 59 4E 41 exit._init._DYNA
      00000D10: 4D 49 43 00 70 72 69 6E - 74 66 00 5F 65 78 69 74 MIC.printf._exit
      00000D20: 00 5F 65 78 5F 64 65 72 - 65 67 69 73 74 65 72 00 ._ex_deregister.
      00000D30: 65 6E 76 69 72 6F 6E 00 - 5F 5F 63 67 38 39 5F 75 environ.__cg89_u
      00000D40: 73 65 64 00 5F 5F 63 67 - 39 32 5F 75 73 65 64 00 sed.__cg92_used.
      00000D50: 5F 5F 66 6E 6F 6E 73 74 - 64 5F 75 73 65 64 00 5F __fnonstd_used._
      00000D60: 50 52 4F 43 45 44 55 52 - 45 5F 4C 49 4E 4B 41 47 PROCEDURE_LINKAG
      00000D70: 45 5F 54 41 42 4C 45 5F - 00 5F 65 64 61 74 61 00 E_TABLE_._edata.
      00000D80: 5F 65 74 65 78 74 00 5F - 6C 69 62 5F 76 65 72 73 _etext._lib_vers
      00000D90: 69 6F 6E 00 6D 61 69 6E - 00 5F 66 69 6E 69 00 00 ion.main._fini..
      00000DA0: 00 00 00 01 00 00 00 02 - 00 00 00 DA 00 00 00 0D ................

The topmost region, 00h-33h, is the ELF header, identified by the "ELF" signature starting at 00h. The region C50h-D9Fh is the string table; each string in the table is null-terminated. The first and last bytes of the string table are null characters.
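Both regions can be decoded with a short Python sketch. The bytes below are copied from the dump (the 52-byte header and the first 24 bytes of the string table). The field order follows the standard 32-bit ELF header layout; the function names are illustrative only. The file is big-endian (byte 05h is 02), so multi-byte fields are read most significant byte first.

```python
import struct

# ELF header, 00h-33h, copied from the hex dump.
ELF_HEADER_HEX = """
7F 45 4C 46 01 02 01 00 00 00 00 00 00 00 00 00
00 02 00 02 00 00 00 01 00 01 06 58 00 00 00 34
00 00 10 BC 00 00 00 00 00 34 00 20 00 05 00 28
00 19 00 17
"""
elf_header = bytes.fromhex(ELF_HEADER_HEX)

# Start of the string table, C50h-C67h, copied from the hex dump.
STRTAB_HEX = """
00 61 2E 6F 75 74 00 63 72 74 69 2E 73 00 5F 65
78 5F 74 65 78 74 30 00
"""
strtab = bytes.fromhex(STRTAB_HEX)

FIELD_NAMES = (
    "e_type", "e_machine", "e_version", "e_entry", "e_phoff", "e_shoff",
    "e_flags", "e_ehsize", "e_phentsize", "e_phnum", "e_shentsize",
    "e_shnum", "e_shstrndx",
)


def parse_elf32_header(data):
    """Decode a 32-bit ELF header into a dict keyed by field name."""
    if data[:4] != b"\x7fELF":
        raise ValueError("missing ELF signature")
    if data[4] != 1:
        raise ValueError("only 32-bit ELF files are handled in this sketch")
    # Byte 05h selects the byte order: 1 = little-endian, 2 = big-endian.
    order = {1: "<", 2: ">"}[data[5]]
    values = struct.unpack_from(order + "HHIIIIIHHHHHH", data, 16)
    return dict(zip(FIELD_NAMES, values))


def string_at(table, offset):
    """Return the null-terminated string starting at `offset` in `table`."""
    end = table.index(b"\x00", offset)
    return table[offset:end].decode("ascii")


fields = parse_elf32_header(elf_header)
print(f"header size        {fields['e_ehsize']:X}h")
print(f"entry point        {fields['e_entry']:X}h")
print(f"section headers at {fields['e_shoff']:X}h, {fields['e_shnum']} entries")
print(f"program headers at {fields['e_phoff']:X}h, {fields['e_phnum']} entries")
print([string_at(strtab, offset) for offset in (0, 1, 7, 14)])
```

The header size decodes to 34h, which agrees with the 00h-33h region identified above. Names in the string table are referred to by byte offset, so offset 1 yields "a.out", offset 7 yields "crti.s" and offset 0 yields the empty string, which is why the table begins with a null byte.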

The total number of bytes for this program is 5280.